Smart Paper Notes

DA-CIL: Towards Domain Adaptive Class-Incremental 3D Object Detection

Ziyuan Zhao, Mingxi Xu, Peisheng Qian, Ramanpreet Singh Pahwa, Richard Chang

Categories: Computer Vision | Artificial Intelligence

2022-12-05

Deep learning has achieved notable success in 3D object detection with the advent of large-scale point cloud datasets. However, severe performance degradation on previously trained classes, i.e., catastrophic forgetting, remains a critical issue for real-world deployment when the number of classes is unknown or may vary. Moreover, existing 3D class-incremental detection methods are developed for the single-domain scenario and fail under the domain shift caused by different datasets, varying environments, etc. In this paper, we identify an unexplored yet valuable scenario, i.e., class-incremental learning under domain shift, and propose a novel 3D domain adaptive class-incremental object detection framework, DA-CIL. Within this framework, we design a novel dual-domain copy-paste augmentation method that constructs multiple augmented domains to diversify the training distributions, thereby facilitating gradual domain adaptation. Multi-level consistency is then explored to facilitate dual-teacher knowledge distillation from different domains for domain adaptive class-incremental learning. Extensive experiments on various datasets demonstrate the effectiveness of the proposed method over baselines in the domain adaptive class-incremental learning scenario.
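
The abstract does not spell out how the dual-domain copy-paste augmentation works, so the following is only a generic copy-paste sketch for point clouds, not the DA-CIL implementation. It assumes axis-aligned boxes and skips collision checks; the names `crop_object`, `copy_paste`, `source_scene`, and `target_scene` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def crop_object(points: Sequence[Point], box_min: Point, box_max: Point) -> List[Point]:
    """Return the points inside an axis-aligned box (assumption: boxes are axis-aligned)."""
    return [
        p for p in points
        if all(lo <= c <= hi for c, lo, hi in zip(p, box_min, box_max))
    ]


def copy_paste(scene: Sequence[Point], donor: Sequence[Point],
               box_min: Point, box_max: Point, offset: Point) -> List[Point]:
    """Crop one object from the donor scene and paste it into the scene, shifted by offset.

    Assumption: no overlap or ground-plane checks are performed.
    """
    obj = crop_object(donor, box_min, box_max)
    moved = [(p[0] + offset[0], p[1] + offset[1], p[2] + offset[2]) for p in obj]
    return list(scene) + moved


# Hypothetical toy scenes standing in for a source-domain and a target-domain point cloud.
source_scene = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.5), (5.0, 5.0, 5.0)]
target_scene = [(10.0, 10.0, 0.0), (11.0, 10.0, 0.0)]

# Paste the source object (inside the unit box) into the target scene.
mixed = copy_paste(target_scene, source_scene,
                   (0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (20.0, 0.0, 0.0))
```

Pasting in both directions (source objects into target scenes and target objects into source scenes) would give intermediate mixed domains of the kind the abstract describes, under the assumptions stated above.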

translated by Google Translate

AIM 2022 Challenge on Super-Resolution of Compressed Image and Video: Dataset, Methods and Results

Ren Yang, Radu Timofte, Xin Li, Qi Zhang, Lin Zhang, Fanglong Liu, Dongliang He, Fu Li, He Zheng, Weihang Yuan

Category: Computer Vision

2022-08-23

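This paper reviews the Challenge on Super-Resolution of Compressed Image and Video at AIM 2022. The challenge includes two tracks. Track 1 targets the super-resolution of compressed images, and Track 2 targets the super-resolution of compressed video. In Track 1, we use the popular DIV2K dataset as the training, validation, and test sets. In Track 2, we propose the LDV 3.0 dataset, which contains 365 videos, comprising the LDV 2.0 dataset (335 videos) and 30 additional videos. In this challenge, 12 teams and 2 teams submitted final results to Track 1 and Track 2, respectively. The proposed methods and solutions gauge the state of the art of super-resolution on compressed images and video. The proposed LDV 3.0 dataset is available at https://github.com/renyang-home/ldv_dataset. The homepage of this challenge is at https://github.com/renyang-home/aim22_compresssr.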


Regularized Contrastive Learning of Semantic Search

Mingxi Tan, Alexis Rolland, Andong Tian

Category: Machine Learning

2022-09-27

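Semantic search is an important task whose objective is to find the relevant index in a database for a given query. It requires a retrieval model that can properly learn the semantics of sentences. Transformer-based models are widely used as retrieval models because of their excellent ability to learn semantic representations, and many regularization methods suited to them have also been proposed. In this paper, we propose a new regularization method, Regularized Contrastive Learning, which helps Transformer-based models learn better sentence representations. It first augments several different semantic representations for each sentence and then feeds them into the contrastive objective as regulators. These contrastive regulators can overcome overfitting and alleviate the anisotropy problem. We first evaluate our approach on 7 semantic search benchmarks using the strong pre-trained model SRoBERTa. The results show that our method is more effective at learning superior sentence representations. We then evaluate our approach on 2 challenging FAQ datasets with long queries and indexes, Cough and FAQIR. The experimental results show that our method outperforms the baseline methods.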


Regularized Soft Actor-Critic for Behavior Transfer Learning

Mingxi Tan, Andong Tian, Ludovic Denoyer

Category: Machine Learning

2022-09-27

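Existing imitation learning methods mainly focus on making an agent effectively mimic a demonstrated behavior, but do not address the potential contradiction between the behavior style and the objective of a task. There is a general lack of efficient methods that allow an agent to partially imitate a demonstrated behavior to varying degrees while completing the main objective of a task. In this paper, we propose a method called Regularized Soft Actor-Critic, which formulates the main task and the imitation task under the Constrained Markov Decision Process (CMDP) framework. The main task is defined as the maximum entropy objective used in Soft Actor-Critic (SAC), and the imitation task is defined as a constraint. We evaluate our method on continuous control tasks relevant to video game applications.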
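
The constrained formulation can be written out roughly as follows. The first line is the standard SAC maximum entropy objective; the constraint cost $c$, the threshold $d$, and the Lagrangian relaxation in the last line are assumptions added for illustration, since the abstract states only that the imitation task is a constraint.

```latex
% Standard SAC maximum entropy objective (temperature \alpha, discount \gamma).
% Assumed for illustration: imitation cost c(s_t, a_t), threshold d, multiplier \lambda.
\begin{gather}
J_{\mathrm{SAC}}(\pi) = \mathbb{E}_{\tau \sim \pi}\left[\sum_{t} \gamma^{t}
  \bigl(r(s_t, a_t) + \alpha\,\mathcal{H}(\pi(\cdot \mid s_t))\bigr)\right] \\
\max_{\pi}\; J_{\mathrm{SAC}}(\pi)
  \quad \text{s.t.} \quad
  J_{c}(\pi) = \mathbb{E}_{\tau \sim \pi}\left[\sum_{t} \gamma^{t} c(s_t, a_t)\right] \le d \\
\min_{\lambda \ge 0}\, \max_{\pi}\; J_{\mathrm{SAC}}(\pi) - \lambda\,\bigl(J_{c}(\pi) - d\bigr)
\end{gather}
```

Under this reading, tightening or loosening $d$ would control how closely the agent imitates the demonstration, which matches the "varying degrees" of imitation the abstract mentions.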


On-Robot Learning With Equivariant Models

Dian Wang, Mingxi Jia, Xupeng Zhu, Robin Walters, Robert Platt

Category: Robotics

2022-03-09

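Recently, equivariant neural network models have been shown to improve sample efficiency for tasks in computer vision and reinforcement learning. This paper explores this idea in the context of on-robot policy learning, in which a policy must be learned entirely on a physical robotic system without reference to a model, a simulator, or an offline dataset. We focus on applications of Equivariant SAC to robotic manipulation and explore a number of variations of the algorithm. Ultimately, we demonstrate the ability to learn several non-trivial manipulation tasks entirely through on-robot experience in less than an hour or two of wall-clock time.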


Few-Sample Traffic Prediction with Graph Networks using Locale as Relational Inductive Biases

Mingxi Li, Yihong Tang, Wei Ma

Category: Machine Learning

2022-03-08

Accurate short-term traffic prediction plays a pivotal role in various smart mobility operation and management systems. Currently, most state-of-the-art prediction models are based on graph neural networks (GNNs), and the number of training samples they require is proportional to the size of the traffic network. In many cities, the amount of available traffic data falls substantially below this minimum requirement because of the expense of data collection. How to develop traffic prediction models for large-scale networks from a small amount of training data remains an open question. We observe that the traffic states of a node in the near future depend only on the traffic states of its localized neighborhood, which can be represented using graph relational inductive biases. In view of this, this paper develops LocaleGN, a graph network (GN)-based deep learning model that depicts traffic dynamics using localized data aggregating and updating functions together with node-wise recurrent neural networks. LocaleGN is a lightweight model designed to be trained on few samples without over-fitting, and hence it can address the problem of few-sample traffic prediction. The proposed model is evaluated on predicting both traffic speed and flow with six datasets, and the experimental results demonstrate that LocaleGN outperforms existing state-of-the-art baseline models. It is also demonstrated that the knowledge learned by LocaleGN can be transferred across cities. The research outcomes can help to develop lightweight traffic prediction systems, especially for cities lacking historically archived traffic data.
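
To make the locality idea concrete, here is a minimal sketch of one round of neighborhood aggregation and update on a road graph. It is not LocaleGN itself: there are no learned weights and no recurrent component, and the mean aggregation, the fixed `self_weight`, and the toy speeds are all assumptions.

```python
from typing import Dict, List


def aggregate_neighbors(states: Dict[str, float],
                        adjacency: Dict[str, List[str]]) -> Dict[str, float]:
    """Mean of each node's neighbor states; isolated nodes keep their own state."""
    out = {}
    for node, neighbors in adjacency.items():
        if neighbors:
            out[node] = sum(states[n] for n in neighbors) / len(neighbors)
        else:
            out[node] = states[node]
    return out


def update(states: Dict[str, float], adjacency: Dict[str, List[str]],
           self_weight: float = 0.5) -> Dict[str, float]:
    """Blend each node's own state with its aggregated neighborhood (assumed fixed weight)."""
    agg = aggregate_neighbors(states, adjacency)
    return {n: self_weight * states[n] + (1.0 - self_weight) * agg[n] for n in states}


# Hypothetical three-sensor corridor a - b - c with current speeds in km/h.
speeds = {"a": 60.0, "b": 40.0, "c": 20.0}
roads = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
next_speeds = update(speeds, roads)
```

Because each update reads only a node's immediate neighbors, the same function applies unchanged to a graph of any size, which is one plausible reason a locality-based model could transfer across cities.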


Trust-aware Control for Intelligent Transportation Systems

Mingxi Cheng, Junyao Zhang, Shahin Nazarian, Jyotirmoy Deshmukh, Paul Bogdan

Category: Artificial Intelligence

2021-11-08

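Many intelligent transportation systems are multi-agent systems, i.e., both the traffic participants and the subsystems within the transportation infrastructure can be modeled as interacting agents. Using AI-based methods to achieve coordination among the different agent systems can provide greater safety than transportation systems containing only human-operated vehicles, and can also improve system efficiency in terms of traffic throughput, sensing range, and enabling collaborative tasks. However, increased autonomy makes the transportation infrastructure vulnerable to compromised vehicular agents or infrastructure. This paper proposes a new framework that embeds a trust authority into the transportation infrastructure to systematically quantify the trustworthiness of agents using an epistemic logic known as subjective logic. In this paper, we make the following novel contributions: (i) We propose a framework for using the quantified trustworthiness of agents to enable trust-aware coordination and control. (ii) We show how to synthesize trust-aware controllers using an approach based on reinforcement learning. (iii) We comprehensively analyze an autonomous intersection management (AIM) case study and develop a trust-aware version called AIM-Trust that leads to lower accident rates in scenarios consisting of a mixture of trusted and untrusted agents.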
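
As background on how subjective logic quantifies trust, the sketch below builds a binomial opinion (belief, disbelief, uncertainty, base rate) from counts of positive and negative observations and computes its projected probability, belief + base rate × uncertainty. This is the textbook evidence mapping with a non-informative prior weight of 2; how the paper's trust authority gathers evidence is not stated in the abstract, and the names `Opinion` and `opinion_from_evidence` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """Binomial subjective-logic opinion; belief + disbelief + uncertainty == 1."""
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5

    def expected(self) -> float:
        """Projected probability: belief plus the base-rate share of the uncertainty."""
        return self.belief + self.base_rate * self.uncertainty


def opinion_from_evidence(positive: float, negative: float,
                          prior_weight: float = 2.0,
                          base_rate: float = 0.5) -> Opinion:
    """Map evidence counts to an opinion (standard mapping; prior_weight=2 is the usual default)."""
    total = positive + negative + prior_weight
    return Opinion(positive / total, negative / total, prior_weight / total, base_rate)


# Hypothetical vehicles: one with 8 well-behaved interactions, one never observed.
well_behaved = opinion_from_evidence(positive=8, negative=0)
unknown = opinion_from_evidence(positive=0, negative=0)
```

An agent with no history gets full uncertainty and falls back to the base rate, while accumulated evidence shrinks the uncertainty term; a trust-aware controller could then threshold or weight agents by `expected()`.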


Reference Twice: A Simple and Unified Baseline for Few-Shot Instance Segmentation

Yue Han, Jiangning Zhang, Zhucun Xue, Chao Xu, Xintian Shen, Yabiao Wang, Chengjie Wang, Yong Liu, Xiangtai Li

Category: Computer Vision

2023-01-03

Few-Shot Instance Segmentation (FSIS) requires models to detect and segment novel classes given only a few support examples. In this work, we explore a simple yet unified solution for FSIS as well as its incremental variants, and introduce a new framework named Reference Twice (RefT) to fully explore the relationship between support and query features based on a Transformer-like framework. Our key insights are twofold. First, with the aid of support masks, we can generate dynamic class centers more appropriately to re-weight query features. Second, we find that support object queries have already encoded key factors after base training. In this way, the query features can be enhanced twice, at the feature level and at the instance level. In particular, we first design a mask-based dynamic weighting module to enhance support features and then propose to link object queries for better calibration via cross-attention. After these steps, performance on novel classes improves significantly over our strong baseline. Additionally, our new framework can be easily extended to incremental FSIS with minor modification. When benchmarked on the COCO dataset for the FSIS, gFSIS, and iFSIS settings, our method achieves competitive performance compared to existing approaches across different shots, e.g., we boost nAP by a noticeable +8.2/+9.4 over the current state-of-the-art FSIS method for 10/30-shot. We further demonstrate the superiority of our approach on Few-Shot Object Detection. Code and model will be available.
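
The abstract does not give the formula behind the mask-based dynamic weighting module. As a rough illustration of the general idea, the sketch below uses masked average pooling, a common technique swapped in here as an assumption, to turn support features and a support mask into a per-channel class center, and then scales query features channel-wise. The flat-list feature layout and the function names are hypothetical.

```python
from typing import List


def masked_average_pool(features: List[List[float]], mask: List[int]) -> List[float]:
    """Average each channel over the pixels where mask == 1.

    features: one flat list of per-pixel values per channel; mask: flat list of 0/1.
    """
    area = sum(mask)
    if area == 0:
        raise ValueError("mask selects no pixels")
    return [sum(v * m for v, m in zip(channel, mask)) / area for channel in features]


def reweight(query_features: List[List[float]], class_center: List[float]) -> List[List[float]]:
    """Scale every query channel by the matching class-center value (assumed weighting rule)."""
    return [[v * w for v in channel] for channel, w in zip(query_features, class_center)]


# Hypothetical 2-channel, 4-pixel support feature map; the object covers the last two pixels.
support = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 2.0, 2.0]]
support_mask = [0, 0, 1, 1]
center = masked_average_pool(support, support_mask)

query = [[1.0, 1.0], [1.0, 1.0]]
weighted_query = reweight(query, center)
```

The mask keeps background pixels out of the class center, so channels that respond strongly on the support object are the ones amplified in the query features.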


AI in HCI Design and User Experience

Wei Xu

Category: Artificial Intelligence

2023-01-03

In this chapter, we review and discuss the transformation of AI technology in HCI/UX work and assess how AI technology will change how we do the work. We first discuss how AI can be used to enhance the result of user research and design evaluation. We then discuss how AI technology can be used to enhance HCI/UX design. Finally, we discuss how AI-enabled capabilities can improve UX when users interact with computing systems, applications, and services.


More is Better: A Database for Spontaneous Micro-Expression with High Frame Rates

Sirui Zhao, Huaying Tang, Xinglong Mao, Shifeng Liu, Hanqing Tao, Hao Wang, Tong Xu, Enhong Chen

Category: Computer Vision

2023-01-03

As one of the most important psychological stress reactions, micro-expressions (MEs) are spontaneous and transient facial expressions that can reveal the genuine emotions of human beings. Thus, automatic ME recognition (MER) is becoming increasingly crucial in the field of affective computing and provides essential technical support for lie detection, psychological analysis, and other areas. However, the lack of abundant ME data seriously restricts the development of cutting-edge data-driven MER models. Despite the recent release of several spontaneous ME datasets to alleviate this problem, these efforts remain limited. To address the shortage of ME data, we construct a dynamic spontaneous ME dataset, currently the largest in ME data scale, called DFME (Dynamic Facial Micro-expressions), which includes 7,526 well-labeled ME videos elicited from 671 participants and annotated by more than 20 annotators over three years. We then apply four classical spatiotemporal feature learning models to DFME in MER experiments to objectively verify the validity of the DFME dataset. In addition, we explore different solutions to the class imbalance and key-frame sequence sampling problems in dynamic MER on DFME, so as to provide a valuable reference for future research. The comprehensive experimental results show that our DFME dataset can facilitate research on automatic MER and provide a new benchmark for MER. DFME will be published via https://mea-lab-421.github.io.
